SYSTEM AND METHOD FOR DETERMINING A COMPUTER USER PROFILE FROM
A MOTION-BASED INPUT DEVICE

Field of the Invention: 



The invention relates to user profiling of computers based on behavioral 
biometrics. More specifically, the invention relates to mouse and keystroke- 
based computer user profiling for security purposes. 

Background of the Invention: 

The increasing reliance of modern societies and economies on computing
infrastructures raises the need for highly secure and dependable computing
technologies. Recent widely publicized security incidents such as the
Slammer worm have shown how vulnerable several critical aspects of
our social and economic life have become because of increased
computerization.

Computer security has also become increasingly important because of the 
large number of security breaches in individual businesses, and the cost of 
those breaches to the businesses. In a recent survey (2003), it was reported 
that the total annual financial losses to the respondents were $201,797,340. 
The actual figure could be higher, since only 251 of the 530 participants
(47%) reported their losses. The survey also shows other compelling
statistics: 92% of the respondents detected attacks during the last 12 months,
while 75% of the respondents acknowledged financial losses due to security
breaches.

Many organizations address security from three different perspectives:
prevention, detection, and reaction. Reportedly, 99% of the respondents to
a survey use a mixture of various technologies to protect their systems. For
example, more than 90% use prevention technologies such as firewalls,
access control, and physical security. Also, 73% use intrusion detection
systems.

One form of protection is password protection. It is a well-established fact 
that traditional passwords are not safe anymore. Passwords may be stolen 
or may be cracked using the so-called dictionary attack. 

Another technology used by corporations to protect their networks is 
firewalls. Firewall technology has been used to protect and isolate segments 
of networks from untrusted networks by filtering out harmful traffic. Firewall
technologies have several limitations that make them relatively poor choices
for strong network protection. There have been
several widely publicized exploits whereby hackers have gained access to
sensitive data by tunneling through authorized protocols. In order to provide
a higher level of security, most organizations combine firewalls with a range 
of security monitoring tools called intrusion detection systems (IDS). 

Intrusion Detection 

The role of IDS is to monitor and detect computer and network intrusions in 
order to take appropriate measures that would prevent or avoid the 
consequences. The Internet is a wild zone, where new forms of security 
attacks are developed and executed daily. Hence, the main challenge 
currently faced by IDS technology is to be able to detect new forms of 
attacks. 

An intrusion is described as a violation of the security policy of the system. It
is also described as any set of actions that attempt to compromise the
integrity, confidentiality, or availability of a resource.

There are three types of intrusion detection systems: anomaly intrusion
detection, misuse intrusion detection, and specification-based detection.
Anomaly detection refers to intrusions that can be detected based on
anomalous activity and use of resources. Misuse detection refers to
intrusions that follow well-defined patterns of attack. Specification-based
detection approaches consider that all well-behaved system executions shall
conform precisely to program specifications.

Existing anomaly detection techniques attempt to establish a normal activity
profile using statistical modeling. Statistical profile-based detection uses a
set of metrics to compute some measurements of user activity, and
compares them against a set of values that characterize normal user activity.
Any discrepancy between the computed values and the expected ones is
considered an intrusion. Anomaly detection techniques to date rely upon a
measured activity. This tends to be an activity in response to an input, and
the techniques therefore rely very heavily upon the constancy of the input.
For example, the number of emails opened in a day may be measured. This, of
course, is highly dependent upon the number of emails received.

Anomaly detection techniques assume that all intrusive activities are 
necessarily anomalous. This means that if we could establish a normal 
activity profile for a system, we could, in theory, flag all system states varying 
from the established profile by statistically significant amounts as intrusion 
attempts. However, if we consider that the set of intrusive activities only
intersects the set of anomalous activities instead of being exactly the same,
we will find the following possibilities:

1. Anomalous activities that are not intrusive are flagged as intrusive
(false positives); and

2. Intrusive activities that are not anomalous go undetected (false negatives).

False negatives are considered very dangerous, and are far more serious
than the issue raised by false positives.

The main issues in existing anomaly detection systems are the selection of 
threshold levels so that neither of the above two problems is unreasonably 
magnified, and the selection of available features to monitor. The features
should effectively discriminate between intrusive and non-intrusive
behaviors. The existing anomaly detection systems are also computationally
expensive because of the overhead of keeping track of, and possibly
updating, several system profile metrics.

The concept behind misuse detection schemes is that there are ways to 
represent attacks in the form of a pattern or a signature so that even 
variations of the same attack can be detected. Misuse detection systems 
can detect many or all known attack patterns, but they are of little use for as 
yet unknown attack methods. 

Specification-based intrusion detection consists of checking whether a 
certain execution sequence violates the specification of programs that may 
affect the system protection state. Specification-based detection has the
potential to detect unknown attacks; however, it is still in its infancy.

Existing intrusion detectors are characterized by significantly high false 
alarm rates. This is mainly a result of the low accuracy of the profiles 
computed. For example, some anomaly detectors base users' profiles on 
metrics such as the average number of files opened or emails sent daily. It 
is easy to find several users sharing the same habits. Further, it is easy for
any user to change his habits and adopt the usage pattern of other users.

Biometrics systems:

Different types of biometrics identification systems are currently available in
the market, and are widely used in various security applications. Biometrics
can be classified into two categories, "physiological biometrics" and
"behavioral biometrics". Physiological biometrics, including finger-scan,
iris-scan, retina-scan, hand-scan, and facial-scan, uses measurements from the
human body. Behavioral biometrics, such as signature or keystroke
dynamics, uses measurements based on human actions. Published
benchmark testing data for existing technologies shows that false rejection
rates vary from 6% for face recognition to 0.25% for iris scan, whereas false
acceptance rates vary from 6% for face recognition to 0.0001% for iris scan.
Behavioral biometrics systems have experienced less success when
compared to physiological systems because of variability in the measured
parameter over time. However, either system provides improvements over
the traditional intrusion detection systems.

Traditional intrusion detection systems focus on the actions conducted by
the user. Biometrics-based systems focus on the identity of the user; hence
such systems are able to detect the type of intrusion where an attacker gains
access to the resources and starts to perform normal non-intrusive
procedures, causing information leakage or any other vulnerabilities.
Differences in usage pattern cannot be detected by traditional intrusion
detection systems if the attacker knows the operation sequences and his
access limits. Such an attack, however, can be uncovered if the detection is
based on biometrics information.

In recent years there has been increasing interest in biometrics systems.
The Oxford dictionary definition of biometrics is "application of statistical
analysis to biological data". In the field of computer security, biometrics is
defined as the automated use of a collection of factors describing human
behavioral or physiological characteristics to establish or verify a precise
identity.

Biometrics systems operate in two modes, the enrollment mode and the
verification/identification mode. In the first mode, biometrics data is acquired
using a user interface or a capturing device, such as a fingerprint scanner.
Raw biometrics data is then processed to extract the biometrics features
representing the characteristics that can be used to distinguish between
different users. This conversion process produces a processed biometrics
identification sample that is stored in a database for future
identification/verification needs. Enrolled data should be free of noise and
any other defects that can affect its comparison to other samples. In the
second mode, biometrics data is captured, processed, and compared against
the stored enrolled sample. According to the type of application, a
verification or identification process will be conducted on the processed
sample.

The verification process conducts one-to-one matching by comparing the
processed sample against the enrolled sample of the same user. For
example, a user is authenticated at login by declaring his identity by entering
his login name. He then confirms his identity by providing a password and
biometrics information, such as his signature, voice password, or fingerprint.
To verify the identity, the system will compare the user's biometrics data
against his record in the database, resulting in a match or non-match. The
identification process matches the processed sample against a large number
of enrolled samples by conducting a 1-to-N matching to identify the user,
resulting in an identified user or a non-match.
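By way of illustration only, the difference between one-to-one verification
and 1-to-N identification can be sketched as follows. The feature vectors, the
user names, the Euclidean distance measure, and the THRESHOLD value are all
assumptions made for this sketch; they are not taken from the described system.

```python
from math import dist

# Hypothetical enrolled samples: user name -> processed feature vector.
ENROLLED = {
    "alice": (0.20, 0.80, 0.50),
    "bob": (0.90, 0.10, 0.40),
}
THRESHOLD = 0.25  # assumed maximum distance that still counts as a match


def verify(claimed_user, sample):
    """One-to-one matching against the enrolled sample of the claimed user."""
    template = ENROLLED.get(claimed_user)
    return template is not None and dist(template, sample) <= THRESHOLD


def identify(sample):
    """1-to-N matching: return the closest enrolled user, or None for a non-match."""
    best_user = min(ENROLLED, key=lambda user: dist(ENROLLED[user], sample))
    return best_user if dist(ENROLLED[best_user], sample) <= THRESHOLD else None
```

Verification needs a declared identity and performs a single comparison,
whereas identification searches all enrolled samples and may still return a
non-match.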

Regardless of the biometrics system employed, the following metrics must 
be computed to determine the accuracy of the system: 

1. False Acceptance Rate (FAR), the ratio between the number of
occurrences of accepting a non-authorized user compared to the
number of access trials.

2. False Rejection Rate (FRR), the ratio between the number of false
alarms caused by rejecting an authorized user compared to the
number of access trials.

3. Failure to Enroll (FTE), the ratio characterizing the number of times
the system is not able to enroll a user's biometrics features; this
failure is caused by poor quality samples during enrollment mode.

4. Failure to Capture (FTC), the ratio characterizing the number of times
the system is not able to process the captured raw biometrics data
and extract features from it; this occurs when the captured data does
not contain sufficient information to be processed.

FAR and FRR values can vary significantly depending on the sensitivity of
the biometrics data comparison algorithm used in the
verification/identification mode; FTE and FTC represent the sensitivity of the
raw data processing module.
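As a minimal sketch of the first two definitions, FAR and FRR can be computed
as ratios over the number of access trials. The function name and the example
counts are hypothetical and serve only to illustrate the arithmetic.

```python
def far_frr(false_accepts, false_rejects, access_trials):
    """Return (FAR, FRR) as ratios over the number of access trials."""
    if access_trials <= 0:
        raise ValueError("access_trials must be positive")
    return false_accepts / access_trials, false_rejects / access_trials


# Hypothetical counts: 3 false acceptances and 5 false rejections in 200 trials.
far, frr = far_frr(3, 5, 200)
```

With these assumed counts, FAR is 3/200 = 1.5% and FRR is 5/200 = 2.5%.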

In order to tune the accuracy of the system to its optimum value, it is
important to study the effect of each factor on the other. Figure 1 shows the
relation between FAR and FRR for a typical biometrics system. If the
system is designed to minimize FAR to make the system more secure, FRR
will increase. On the other hand, if the system is designed to decrease FRR
by increasing the tolerance to input variations and noise, FAR will increase.
For the system indicated in Figure 1, the point E, where FAR and FRR reach
approximately equal low values, represents the optimum tuning for this
system.
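The tuning toward point E can be sketched, under stated assumptions, as a
sweep over candidate decision thresholds that keeps the one where FAR and FRR
are closest. The similarity scores, the candidate thresholds, and the choice
to compute FAR over impostor attempts and FRR over genuine attempts separately
are assumptions of this sketch, not details given in the text.

```python
def error_rates(genuine_scores, impostor_scores, threshold):
    """Accept when score >= threshold; return (FAR, FRR) at that threshold."""
    frr = sum(s < threshold for s in genuine_scores) / len(genuine_scores)
    far = sum(s >= threshold for s in impostor_scores) / len(impostor_scores)
    return far, frr


def equal_error_point(genuine_scores, impostor_scores, thresholds):
    """Return the threshold where |FAR - FRR| is smallest, with its rates."""
    def gap(threshold):
        far, frr = error_rates(genuine_scores, impostor_scores, threshold)
        return abs(far - frr)

    best = min(thresholds, key=gap)
    return best, error_rates(genuine_scores, impostor_scores, best)


# Hypothetical similarity scores for authorized and non-authorized users.
genuine = [0.9, 0.8, 0.7, 0.6, 0.4]
impostor = [0.1, 0.2, 0.3, 0.5, 0.65]
best, (far_e, frr_e) = equal_error_point(genuine, impostor, [0.2, 0.35, 0.55, 0.75])
```

Raising the threshold lowers FAR and raises FRR, and lowering it does the
opposite, which is the trade-off that Figure 1 describes.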

The utilization of biometrics technology has been limited to identity
verification in authentication and access control systems. Hence, important
security applications such as intrusion detection systems have been left out
of this technology. There are two reasons for this. First, most biometrics
systems require special hardware devices for biometrics data collection,
which restricts their use to the network segments that provide them. This
makes the systems irrelevant for a significant number of remote users, who
operate outside of these network segments. Second, most biometrics systems
require active involvement of the user, who is asked to provide data samples
that can be used to verify his identity. This excludes the possibility of
passive monitoring, which is essential for intrusion detection. There are also
a number of secondary obstacles to the use of biometrics for intrusion
detection, such as whether the technology allows dynamic monitoring or
real-time detection.

Keystroke dynamic biometrics: 

A popular biometrics system that escapes some of the limitations of 
behavioral biometrics is keystroke dynamics biometrics. Keystroke 
dynamics doesn't require special hardware for data collection (a regular 
keyboard is enough). Under certain circumstances it can be used for 
dynamic monitoring. The traditional keystroke technology, however, doesn't
allow passive monitoring, as the user is required to type a predefined word or
set of words that is used to identify him. The dwell time and the flight time
for keyboard actions are then measured. Thereafter, a set of so-called
digraphs, tri-graphs, or n-graphs is constructed and analyzed to produce a
distinctive pattern. User authentication and classification are the most
suitable applications for such technology.

Mouse dynamic biometrics: 

Previous work on mouse dynamics has, so far, been limited to user
interface design improvement. Studies have been conducted to establish
the applicability of Fitts' law in predicting the duration of a movement to a
target based on the size of the target and the distance from the starting point
to the target. According to Fitts' law, the mean movement time for a
movement with distance A to a target with width W is as follows:

$MT = a + b \log_2(2A/W)$, where a and b are empirically determined
parameters.
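As a worked example of the formula, with values chosen purely for
illustration (a = 0.1 s, b = 0.15 s per bit, A = 256 pixels, W = 32 pixels):
2A/W = 16, log2(16) = 4, so MT = 0.1 + 0.15 x 4 = 0.7 s. The same calculation
can be sketched in code; the function name and parameter values are
hypothetical.

```python
from math import log2


def fitts_movement_time(a, b, distance, width):
    """Mean movement time MT = a + b * log2(2A / W) according to Fitts' law."""
    if distance <= 0 or width <= 0:
        raise ValueError("distance and width must be positive")
    return a + b * log2(2 * distance / width)


# Illustrative parameters only; a and b must be determined empirically.
mt = fitts_movement_time(a=0.1, b=0.15, distance=256, width=32)
```

Halving the target width or doubling the distance adds one bit to the
logarithmic term and therefore adds b seconds to the predicted time.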

In experiments focused on graphical user interface design, mouse cursor
movements were measured to assess psychological responses in patients.
A specific user interface was used to force the user to do specific
movements. The user was asked to move the mouse from a specific point
toward a specific object located at a certain distance. The study took
into consideration the effect of movement direction and the object size. The
study allowed the understanding of several user interface properties related
to the shape, size, location, and preferred angle of approach of the target
object.

It is an objective of the invention to overcome the deficiencies of the prior art.

Summary of the Invention:

The present invention provides a system and methods for computer user 
profiling based on behavioral biometrics. The approach consists of 
establishing distinctive profiles for computer users based on how they use a
motion-based input device such as, but not limited to, a mouse and/or a 
keyboard. The profiles computed in the present invention are more accurate 
than those obtained through the traditional statistical profiling techniques, 
since they are based on distinctive biological characteristics of users. 

The present invention allows passive, dynamic, and real-time monitoring of
users without the need for special hardware; it simply requires a
motion-based input device, such as a standard computer mouse or keyboard, for
data collection. Mouse and keystroke dynamics biometrics are two related
technologies that complement each other.

In one embodiment of the invention, a behavioral biometrics-based user
verification system for use with a motion-based input device is provided. 
The system comprises a data interception unit for receiving inputs from a 
user, a behavior analysis unit operatively coupled to the data interception 
unit, and a behavior comparison unit operatively coupled to the behavior 
analysis unit. The system translates behavioral biometrics information into 
representative data, stores and compares different results, and outputs a 
user identity result. 

In one aspect of the invention, the user verification system is suitably
configured for dynamic monitoring.

In another aspect of the invention, the user verification system is suitably
configured for passive data collection.

In another aspect of the invention, the user verification system is suitably
configured for real-time monitoring.

In another aspect of the invention, the user verification system further
comprises secure communication protocols operatively coupled to the data
interception unit.

In another aspect of the invention, the data interception unit of the user
verification system is configured to identify data from a mouse as one of
movement, drag and drop, point and click, and silence, such that in use, the
system receives data from a mouse.

In another aspect of the invention, the data interception unit of the user
verification system is further configured to characterize movement based on at
least one of average speed, average traveled distance, and direction of
movement.

In another embodiment of the invention, the data interception unit is
configured to identify actions from a keyboard on the basis of dwell time and 
flight time such that in use, the system receives data from a keyboard. 

In another aspect of the invention, the data interception unit is further
configured to identify actions from a mouse as one of movement, drag and drop,
point and click, and silence, such that in use, the system receives data from a
mouse and from a keyboard.

In another aspect of the invention, the data interception unit is further 
configured to characterize mouse movement based on at least one of 
average speed, average traveled distance, and direction of movement. 

In another embodiment of the invention, a method of characterizing a user 
comprises the steps of moving a motion-based input device, collecting data 
from the device, processing the data, and modeling the data using suitably 
selected algorithms to develop a signature for a user. 

In one aspect of the invention, the method further comprises comparing the 
signature with a signature of an authorized user. 



In another aspect of the invention, the method further comprises filtering the 
data after processing and before modeling to reduce noise. 

In another aspect of the invention, the method further comprises passively 
collecting data. 






WO 2004/097601 PCT/CA2004/000669 



In another aspect of the invention, the method further comprises collecting,
processing, and modeling the data in real time.

In another aspect of the invention, the method is further characterized as 
moving a mouse, collecting data from the mouse, processing the data, and 
modeling the data using suitably selected algorithms to develop a signature 
for a user. 

In another aspect of the invention, the collecting data further comprises 
characterizing movement based on at least one of average speed, average 
traveled distance, and direction of movement. 

In another embodiment of the invention the method is further characterized 
as using a keyboard, collecting data from the keyboard, processing the data, 
and modeling the data using suitably selected algorithms to develop a 
signature for a user. 

In one aspect of the invention, the collecting data further comprises
characterizing movement based on flight time and dwell time.

In another aspect of the invention, the method further comprises collecting 
data from a mouse, processing the data and modeling the data using 
suitably selected algorithms to develop a signature for a user based on both 
mouse and keyboard data. 

In another aspect of the invention, the collecting data further comprises 
characterizing movement based on at least one of average speed, average 
traveled distance, and direction of movement.

List of Figures:

The invention will be better understood with reference to the following
figures:

Figure 1. Tuning the system for best accuracy by studying the relation
between FAR and FRR.

Figure 2. Detector architecture in accordance with an embodiment of the
invention.

Figure 3. Mouse dynamics detector architecture in accordance with an
embodiment of the invention.

Figure 4. Example of data generated from the interception unit.

Figure 5. Neural network used in the behavior modeling stage.

Figure 6. The log-sigmoid transfer function.

Figure 7. Determining the training stop point for curve approximation neural
network.

Figure 8. Mouse signature reproducibility.

Figure 9. Comparing mouse signatures.

Figure 10. Average speed for different movement directions.

Figure 11. Histogram of the directions of movement.

Figure 12. Average speed for different types of actions.

Figure 13. Histogram of the types of actions.

Figure 14. Comparing traveled distance histograms. 

Figure 15. Comparing elapsed time histograms. 

Figure 16. Implementation of the detection neural network. 

Figure 17. Neural Network used for behavior classification. 

Figure 18. Experiment hardware setup. 

Figure 19. Neural network training curve for the first user. 

Figure 20. Neural network model used in the detector. 

Figure 21. Tri-graph based analysis.

Figure 22. Example of how to approximate unavailable digraphs.

Detailed description of the invention:

There are two embodiments of the system of the present invention, as
shown in Figure 1. The first is keystroke dynamics and the second is mouse
dynamics. Both record movement related to the use of the article
under normal conditions of operation.

Keystroke dynamics: 

This biometrics measures the dwell time (the length of time a key is held
down) and flight time (the time to move from one key to another) for
keyboard actions. After these measurements are collected, the collected
actions are translated into a number of digraphs or tri-graphs and are then
analyzed in order to produce a pattern. In access control applications, the
extracted group of digraphs and tri-graphs is pre-defined, since the user is
asked to enter a paragraph containing them. In intrusion detection
applications, however, this scenario is not applicable. Detecting the
behavior from an unexpected set of digraphs requires large amounts of data
to be collected in the enrollment mode so as to cover a higher percentage of
the captured data in the verification mode. Regardless of the application, an
algorithm generates a Keystroke Dynamics Signature (KDS), which is used
as a reference user profile. To construct the KDS, we use a key-oriented,
neural network-based approach, where a neural network is trained for each
keyboard key to best simulate its usage dynamics with reference to other
keys. We also propose a technique that can be used to approximate a
tri-graph value based on other detected tri-graphs and the locations of the
keys with reference to each other, aiming to minimize the failure to compare
ratio (FTC) and to speed up the user enrollment process.
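The dwell-time and flight-time measurements can be illustrated with a minimal
sketch. It assumes each keystroke is recorded as a hypothetical (key,
press_ms, release_ms) tuple and that flight time is measured from the release
of one key to the press of the next; it does not cover the neural network
modeling or the tri-graph approximation technique.

```python
from collections import defaultdict
from statistics import mean


def dwell_times(keystrokes):
    """Mean dwell time (release - press) per key, in milliseconds."""
    by_key = defaultdict(list)
    for key, press_ms, release_ms in keystrokes:
        by_key[key].append(release_ms - press_ms)
    return {key: mean(values) for key, values in by_key.items()}


def digraph_flight_times(keystrokes):
    """Mean flight time per digraph (pair of consecutive keys), in milliseconds."""
    by_pair = defaultdict(list)
    for (key1, _, release1), (key2, press2, _) in zip(keystrokes, keystrokes[1:]):
        by_pair[(key1, key2)].append(press2 - release1)
    return {pair: mean(values) for pair, values in by_pair.items()}


# Hypothetical capture of the keys t, h, e, t, h.
sample = [("t", 0, 90), ("h", 150, 230), ("e", 300, 370),
          ("t", 500, 600), ("h", 640, 700)]
```

In free-text monitoring, only the digraphs that happen to occur are observed,
which is why a large enrollment sample is needed to cover the digraphs seen
later in verification mode.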

Mouse dynamics:

Selected mouse actions generated as a result of user interaction are 
compared with a graphical user interface. The data obtained from these 
actions are then processed in order to analyze the behavior of the user. 
Mouse actions include general mouse movement, drag and drop, point and 
click, and silence (i.e. no movement). The behavioral analysis utilizes neural 
networks and statistical approaches to generate a number of factors from the 
captured set of actions; these factors are used to construct what is called a 
Mouse Dynamics Signature (MDS), a unique set of values characterizing the 
user's behavior over the monitoring period. Some of the factors consist of
calculating the average speed against the traveled distance, or calculating
the average speed against the movement direction. Presently, up to seven
factors that exhibit strong stability and uniqueness capability are reported;
however, more may be considered. The detection algorithm calculates the
significance of each factor with respect to the other factors in the same
signature, and with respect to its corresponding values in other users'
signatures. A neural network is trained for each enrolled user, resulting
in a different detection scheme being used for each of them.

Architecture: 

Figure 2 depicts the architecture of the detector. The detector is 
implemented as client/server software. The client module, which runs on the 
monitored machine (e.g. potential victim), is responsible for mouse 
movement and keystroke data collection. These data are sent to the server 
software, which runs on a separate machine. The server software is in 
charge of analyzing the data and computing a biometrics profile. The 
computed profile is then submitted to a behavior comparison unit, which 
checks it against the stored profiles.

For remote users, the approach consists of either providing them with
remote login software or extending secure remote login software such as
Secure Shell (SSH). The administrator then requires that users use this
particular remote login implementation for remote access. 

It is common practice in most organizations that remote access be regulated 
by a defined and strict policy. In order to ensure that only users abiding by 
this policy access the monitored network, the biometrics detector is extended 
with a network traffic analyzer that monitors both attempted and established 
connections to the target machine. A connections list established by the 
traffic analyzer is compared against the active users list maintained by the 
core biometrics detector, and possible discrepancies are then reported as 
intrusions to the security administrator. This applies even when the data 
collection module is installed on the target machine. 

If the network analyzer detects resource usage on the target machine while
no biometrics data is collected during a session, this raises the
possibility that the corresponding network traffic is due to a malicious
process that is not being executed by a legitimate user. On the other hand, if
the biometrics detector is able to monitor activities on the target machine
while the network analyzer fails to detect the network traffic resulting from
such activities, this raises the possibility that the attacker managed to
modify the behavior of the running application.
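The comparison of the connections list with the active users list can be
sketched as two set differences. The session identifiers and the result keys
below are hypothetical; how sessions are actually matched between the traffic
analyzer and the biometrics detector is not specified here.

```python
def cross_check(connections, active_users):
    """Compare the traffic analyzer's connections with the detector's active users.

    Both arguments are iterables of hypothetical session identifiers.
    """
    connections, active_users = set(connections), set(active_users)
    return {
        # Traffic but no biometrics data: possibly a malicious process.
        "traffic_without_biometrics": sorted(connections - active_users),
        # Biometrics data but no traffic: possibly a modified application.
        "biometrics_without_traffic": sorted(active_users - connections),
    }


report = cross_check(
    connections=["alice@10.0.0.5", "unknown@10.0.0.9"],
    active_users=["alice@10.0.0.5", "bob@10.0.0.7"],
)
```

Any non-empty list in the result would be reported to the security
administrator as a possible intrusion.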

A key issue concerns the protection of the collected biometrics data from
forgery. To ensure that an intruder cannot intercept and modify the collected
data, secure communication protocols for client and server interactions are
used. Forgery can still happen by observing the biometrics generation
process or by stealing biometrics samples. In the particular case of mouse
and keystroke dynamics, forgery by observation is extremely difficult to
achieve. For each machine connected to the protected domain, the
administrator may enforce the following policy:



"There is NO rexec or telnet access to this machine.

There is NO rlogin or rsh access to this machine.

FTP is NOT secure and may be removed from this machine in the
near future.

To access this machine remotely, use Secure Shell protocol 2
(SSH2), secure FTP (SFTP), and/or Secure Copy Protocol (SCP).
Bio client version 1.0 should be running on the remote side in
order to access the machine remotely.

Software available on this machine is listed at:
http://Web_Domain/computing/software.shtml

Use of this facility must adhere to: 'Policy 6030: Organization
Computing and Telecommunications User Responsibilities',
http://Web_Domain/Policies/pol6000/6030CTOR.html

'Organization Standards for Professional Behavior',
http://Web_Domain/policy/professional-behaviour.html

Note that this machine will usually be rebooted at the end of
every month. Please schedule your jobs accordingly.

System Administrator: admin

Apr 04 2004"



Mouse actions can be classified into, for example, but not limited to, the
following categories:

1. Movement (General Movement) 

2. Drag and Drop (the action starts with mouse button down, movement, 
then mouse button up) 

3. Point & Click (mouse movement followed by a click or double click) 

4. Silence (No Movement) 



Different approaches are used in each category to collect the factors
characterizing it. Some examples of the types of factors collected from each
analysis are the following:

- Calculating the average speed against the traveled distance.

- Calculating the average speed against the movement direction (eight
directions are considered).

- Calculating the average traveled distance for a specific period of time
with respect to different movement directions. From such data we can
build a usage pattern for the different directions.

For each factor, the reproducibility and discrimination capability are then
determined.
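One of the factors listed above, the average speed against the movement
direction, can be sketched as follows. The sketch assumes each movement is a
hypothetical (dx, dy, elapsed_ms) tuple and that the eight directions are
45-degree sectors numbered 0 to 7, with sector 0 centered on the positive
x-axis; the actual sector layout is not specified in the text.

```python
from collections import defaultdict
from math import atan2, hypot, pi


def direction_bin(dx, dy):
    """Map a displacement to one of eight 45-degree sectors (0 to 7)."""
    angle = atan2(dy, dx) % (2 * pi)
    return int((angle + pi / 8) // (pi / 4)) % 8


def average_speed_by_direction(movements):
    """Mean speed (pixels per millisecond) for each observed direction."""
    speeds = defaultdict(list)
    for dx, dy, elapsed_ms in movements:
        if elapsed_ms > 0:  # skip records with no measurable duration
            speeds[direction_bin(dx, dy)].append(hypot(dx, dy) / elapsed_ms)
    return {d: sum(v) / len(v) for d, v in speeds.items()}


# Hypothetical movements: two along +x and one along -y.
profile = average_speed_by_direction([(300, 0, 100), (100, 0, 100), (0, -50, 25)])
```

Because screen coordinates usually have the y-axis pointing down, the mapping
from sector number to "up" or "down" depends on the capture layer and would
need to be fixed by convention.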

Data Acquisition and Processing 

Figure 3 shows a mouse dynamics detector system, generally referenced as 
10. The system 10 consists of three units: a Data Interception Unit 12, a 
Behavior Analysis Unit 14, and a Behavior Comparison Unit 16. The 
detector 10 translates biometrics information into representative data, stores 
and compares different results, and outputs the user identity verification 
result. 

The Data Interception Unit 12 is responsible for transparently intercepting 
and converting all mouse movements and actions into meaningful 
information. It continuously feeds the Behavior Analysis Unit 14 with the 
processed data. The Behavior Analysis Unit 14 is responsible for analyzing 
the received data, identifying working sessions, and modeling the data to 
produce the MDS. The functionality of the Behavior Analysis Unit 14 
changes according to the operation mode. In the enrollment mode, it works 
on data from different sessions to produce the reference MDS for the user. 
In the verification/identification mode, this unit generates the MDS for the 
user during the detected session. 

The Behavior Comparison Unit 16 is responsible for comparing the
generated MDS to the reference MDS of the user. This unit maintains a
database of all reference signatures calculated for all known system users.
This database is used for user identification/verification purposes. The
Behavior Comparison Unit 16 uses specific comparison algorithms for 
different MDS factors. The output of the unit is a ratio representing the 
difference between the detected behavior and the reference one. The 
higher this ratio is, the more confident the system is that the signature is for 
the same user. Other security modules (e.g. intrusion detector) for different 
security needs can use this ratio as a biometrics suspicion ratio on the 
identity of the user. 

The first step in the detector 10 is to monitor the mouse actions. This is
accomplished by running a process in the background that hooks all mouse
actions transparently, without affecting the application receiving the actions.
The data collected are a list of actions, for example, but not limited to mouse 
move event, left button down event, or left button up event. Such events do 
not provide meaningful information that can be used in analyzing the 
behavior. Consequently, it is the responsibility of the interception software to 
translate those events into meaningful actions. For example, a set of actions 
that is considered to be a good input to the behavior analysis unit could be 
represented by the following series of events, measured in milliseconds: 

- a mouse movement from a position to another position, 

- followed by a period of silence, 

- followed by another mouse move ended by a click or double click.

The interception software also detects the direction of movement for each
generated movement action. Eight movement directions are considered in
the data interception unit 12 software. The interception software will
continuously feed the behavior analysis unit 14 every time mouse actions
are detected on the monitored workstation 18. An example of the produced
record contents is the type of action, the movement direction, the traveled
distance, and the elapsed time in milliseconds. Figure 4 shows an example
of the intercepted data. The x-axis represents the traveled distance and the
y-axis represents the movement speed. Each point on this figure represents
an intercepted mouse action. For simplicity of the example, the effects of the
type of action and movement direction are ignored. Thus, this curve gives a
general idea of how the user's mouse movement speed is affected by the
distance traveled. The data interception unit 12 deals directly with the
mouse 20.
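As an illustration of the record described above, the following minimal sketch turns the start and end points of a movement into a record holding the action type, direction, distance, and elapsed time. The names MouseAction, direction_sector, and make_action are hypothetical. The sector numbering (sector 1 starting at "straight up" and proceeding clockwise in 45-degree steps) is an assumption; it was chosen only because it makes sectors 2, 3, 6, and 7 the mostly horizontal ones, which agrees with the direction numbers mentioned later in this description.

```python
import math
from dataclasses import dataclass


@dataclass
class MouseAction:
    action_type: str   # hypothetical labels, e.g. "MM", "PC", "DD"
    direction: int     # sector number 1..8 (numbering is an assumption)
    distance: float    # traveled distance in pixels
    elapsed_ms: float  # elapsed time in milliseconds

    @property
    def speed(self) -> float:
        """Average speed in pixels per millisecond (0 if no time elapsed)."""
        return self.distance / self.elapsed_ms if self.elapsed_ms > 0 else 0.0


def direction_sector(dx: float, dy: float) -> int:
    """Map a displacement to one of eight 45-degree sectors, numbered 1..8."""
    # Angle measured clockwise from "up"; screen y is assumed to grow downward.
    angle = math.degrees(math.atan2(dx, -dy)) % 360.0
    # The "% 8" guards against an angle that rounds to exactly 360.0.
    return int(angle // 45.0) % 8 + 1


def make_action(action_type, x0, y0, x1, y1, elapsed_ms) -> MouseAction:
    """Build one record from the start point, end point, and elapsed time."""
    dx, dy = x1 - x0, y1 - y0
    return MouseAction(action_type, direction_sector(dx, dy),
                       math.hypot(dx, dy), elapsed_ms)
```

A record for a move from (100, 100) to (130, 60) lasting 500 ms would be built with make_action("MM", 100, 100, 130, 60, 500).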

One of the parameters affecting the accuracy of this detector is the desktop
resolution. If the reference MDS has been calculated on a specific resolution
while the detection process has been done on a different resolution, this will
affect the range of the data collected and will be reflected in the results.
Another parameter is the operating system mouse pointer speed and
acceleration settings. Any changes to these settings can affect the
calculated figures and also affect the user behavior itself while dealing with
the mouse input device. As an example, if the mouse pointer speed is slow,
the user will need more than one action to move the pointer along a
distance, whereas a single action at medium speed may be all that is
required to move the same distance. The mouse button configuration will
also affect the detector 10. In order to achieve reproducible results, variable
factors should be fixed for each user on a specific workstation 18.

Session identification 

As the behavior analysis unit 14 receives input from the data interception
unit 12, the data will be processed in batches. Each batch consists of a
number of monitored actions. A number of parameters are used in this
process:

- Session start is determined if an action is received for a specific user 
and there were no current sessions in effect for this user. 

- Session end is determined if the current active session length
reached the maximum limit, or the number of recorded actions in this
session exceeded the maximum limit. This limit is calculated based
on several factors; it can be calculated per user, depending on the
average number of actions the user produced in a period of time.
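The two session rules above can be sketched as follows. This is a minimal illustration under stated assumptions: the names Session and split_sessions, the use of millisecond timestamps, and the exact comparison operators are hypothetical and are not taken from the described system.

```python
from dataclasses import dataclass, field


@dataclass
class Session:
    user: str
    start_ms: int
    actions: list = field(default_factory=list)


def split_sessions(user, timed_actions, max_length_ms, max_actions):
    """Group (timestamp_ms, action) pairs for one user into sessions.

    A session starts with the first action received while no session is
    open. It ends once its length has reached max_length_ms or it already
    holds max_actions actions; the next action then opens a new session.
    """
    sessions, current = [], None
    for ts, action in timed_actions:
        if current is not None and (
            ts - current.start_ms >= max_length_ms
            or len(current.actions) >= max_actions
        ):
            sessions.append(current)
            current = None
        if current is None:
            current = Session(user, ts)
        current.actions.append(action)
    if current is not None:
        sessions.append(current)
    return sessions
```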

A session tag is associated with each session. This tag contains information
on the session such as, but not limited to, user name, machine name, internet
protocol address, start time/date, and end time/date. This module maintains
a small database for the current recognized sessions. In the enrollment 
mode, a number of sessions for the same user will be stored in this 
database. These sessions will be used by the behavior modeling stage to 
generate the user's reference behavior. In the verification/identification 
mode a recognized session will be kept in the database until it is processed 
by the behavior modeling stage. 

After the collected data has been converted into sessions, the data are
filtered to decrease noise resulting from both human and machine sources.
Thereafter, the behavior modeling module processes the batch of actions to
generate the MDS. For example, Figure 4 shows the traveled distance
against movement speed data before the filtration process took place. Two
filters were applied before sending the data to the behavior modeling stage.
The first filter restricted the input data to a specific range, eliminating any
data above or below that range, for example restricting the distance range
from 25 pixels to 900 pixels. The second filter eliminated any reading on the
y-axis that was determined to be highly deviant from the mean of its adjacent
points.
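The two filters can be sketched as below. The 25 to 900 pixel range comes from the example above; everything else is an assumption made for illustration, in particular the function names, the neighborhood size (window), and the rule that a reading is "highly deviant" when it differs from the mean of its neighbors by more than max_rel_dev times that mean.

```python
def range_filter(points, lo=25.0, hi=900.0):
    """Keep only (distance, speed) points whose distance lies in [lo, hi]."""
    return [(d, s) for d, s in points if lo <= d <= hi]


def neighbor_filter(points, window=2, max_rel_dev=1.0):
    """Drop points whose speed deviates strongly from adjacent points.

    Points are sorted by distance; each speed is compared with the mean
    speed of up to `window` neighbors on each side (the point itself is
    excluded from the mean).
    """
    pts = sorted(points)
    kept = []
    for i, (d, s) in enumerate(pts):
        lo_i, hi_i = max(0, i - window), min(len(pts), i + window + 1)
        neighbors = [pts[j][1] for j in range(lo_i, hi_i) if j != i]
        if not neighbors:
            kept.append((d, s))  # a single point has nothing to compare with
            continue
        mean = sum(neighbors) / len(neighbors)
        if abs(s - mean) <= max_rel_dev * mean:
            kept.append((d, s))
    return kept
```

Applying range_filter first and neighbor_filter second mirrors the order in which the two filters are described.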

Behavior Modeling 

The output of the noise reduction stage was examined and compared to the
output for different sessions for the same user in order to find a pattern
characterizing the graph. In order to automate the detection process,
however, the data were formalized. Various statistical analysis packages
can be used to achieve this goal, according to the characteristic of the factor.
In the present example of the traveled distance against movement speed
factor (see Figure 4), a neural network was used to approximate the
collected data to a curve that could be used to identify the user behavior.
One of the most common uses of neural networks is function approximation.
It was shown by Hecht-Nielsen that for any continuous mapping f with n
inputs and m outputs, there must exist a three-layer neural network with an
input layer of n nodes, a hidden layer with 2n+1 nodes, and an output layer
with m nodes that implements f exactly [Hecht-Nielsen 1987]. According to
those results, it was postulated that neural networks can approximate any
function in the real world. Hecht-Nielsen established that a back propagation
neural network is able to implement any function to any desired degree of
accuracy [Hecht-Nielsen 1989].

A feed-forward multi-layer perceptron (MLP) network was employed for the
neural network. MLP is one of the most popular network architectures; it is
widely used in various applications. The network is depicted in Figure 5 and 
consists of a number of nodes organized in a layered feed-forward topology. 
The feed-forward topology consists of an input layer, an output layer and 
one hidden layer. 

All connections between nodes were fed forward from inputs toward outputs. 
The MLP network used a linear Post Synaptic Potential (PSP) function; the 
PSP function used was the weighted sum function. The transfer function 
used in this network was the log-sigmoid function. The function generated 
outputs between 0 and 1 as the neuron's net input went from negative to 
positive infinity (see Figure 6). 

A linear transfer function was used for the input and output layers to allow
the expected input and output range. For faster training, the network was
initialized with the weights and biases of a similar network trained for a
straight line.

The output of the neural network was described by the following equation:

$$ y = \sum_{j=1}^{N} \frac{w_{2j}}{1 + e^{-(w_{1j} x - b_{1j})}} - b_{21} $$

Where w_ij and b_ij represent the weights and biases of the hidden and output
layers respectively, x is the input to the network, and N represents the
number of nodes in the hidden layer (which is set to N=5 in our design).

The back propagation algorithm was used to train the network. The back
propagation algorithm searched for the minimum of the error function in
weight space using the method of gradient descent. The error criterion
of the network was defined as follows:

Where w represents the network weights matrix and p is the number of
input/output training pairs. Weights were adjusted during the training
trials until the combination of weights minimizing the error criterion was
found. This set of weights was considered a solution for the learning
process. The back propagation learning rule, which calculates the weight
increment, was described as follows: where η is a trial-independent
learning rate, and δ_j is the error gradient at node j.
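The equations for the error criterion and the learning rule are not legible in this text. A conventional textbook formulation that matches the surviving definitions (the weights matrix w, the number of training pairs p, a learning rate written here as \eta, and an error gradient at node j written here as \delta_j) is sketched below. The target t_k, the network output y_k, and the output o_i of the node feeding node j are symbols assumed here for illustration, and the original may have used a different normalization.

```latex
% Sum-of-squares error over the p training pairs (assumed form)
E(w) = \frac{1}{2} \sum_{k=1}^{p} \left( t_k - y_k \right)^2

% Gradient-descent weight increment for the connection that feeds node j
% from a node with output o_i (assumed form)
\Delta w = -\eta \, \frac{\partial E}{\partial w} = \eta \, \delta_j \, o_i
```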

During the behavior modeling stage, the neural network was trained with
filtered collected data. Input vectors and their corresponding target vectors
were used. The back propagation training algorithm was used to train the
network until it could approximate a function describing the collected data.
The training approach may involve the curve over-fitting problem. In order to
avoid the over-fitting problem, first the right complexity of the network was
selected. A network with a single hidden layer containing five perceptrons
was sufficient to produce a good result. Training of the network must be
validated against an independent validation set. At the beginning of the
training, the training error and the validation error both decreased until a
point was reached where the validation error started to increase. This point
is the stop point (corresponds to point A in Figure 7). The stop point is where
the training should stop to obtain the desired generalization.
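The training procedure can be illustrated with a small, self-contained sketch: a one-input network with five log-sigmoid hidden nodes and a linear output, trained by per-sample gradient descent and stopped with reference to the validation error. Several details are assumptions rather than the described design: the random initialization (the text initializes from a network trained on a straight line), the learning rate, inputs scaled to the range 0 to 1, the convention that biases are subtracted, and the patience-based rule used to decide that the validation error has stopped improving.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(net, x):
    """Linear output over log-sigmoid hidden nodes; biases are subtracted."""
    w1, b1, w2, b2 = net
    return sum(w2[j] * sigmoid(w1[j] * x - b1[j]) for j in range(len(w1))) - b2


def mse(net, data):
    return sum((predict(net, x) - t) ** 2 for x, t in data) / len(data)


def train(train_set, val_set, hidden=5, rate=0.05,
          max_epochs=2000, patience=50, seed=0):
    """Back propagation with early stopping; returns (best_net, best_val_error)."""
    rng = random.Random(seed)
    w1 = [rng.uniform(-1, 1) for _ in range(hidden)]
    b1 = [rng.uniform(-1, 1) for _ in range(hidden)]
    w2 = [rng.uniform(-1, 1) for _ in range(hidden)]
    b2 = 0.0
    best_net, best_err, stale = (list(w1), list(b1), list(w2), b2), float("inf"), 0
    for _ in range(max_epochs):
        for x, t in train_set:
            h = [sigmoid(w1[j] * x - b1[j]) for j in range(hidden)]
            err = sum(w2[j] * h[j] for j in range(hidden)) - b2 - t
            for j in range(hidden):
                # Gradient of the squared error through hidden node j.
                grad_h = err * w2[j] * h[j] * (1.0 - h[j])
                w2[j] -= rate * err * h[j]
                w1[j] -= rate * grad_h * x
                b1[j] += rate * grad_h  # bias enters with a minus sign
            b2 += rate * err            # output bias also enters with a minus sign
        val_err = mse((w1, b1, w2, b2), val_set)
        if val_err < best_err:
            best_net, best_err, stale = (list(w1), list(b1), list(w2), b2), val_err, 0
        else:
            stale += 1
            if stale >= patience:       # validation error no longer improving
                break
    return best_net, best_err
```

The returned network is the one with the lowest validation error seen during training, which plays the role of the stop point described above.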

After the network-training curve reached the stop point, the network was fed
with a test stream presenting the spectrum of the input data. The result was
a curve approximation of the training data. This curve was considered as a
factor in the MDS for this user.

Figure 8 shows examples of mouse signatures calculated for the same user
over a number of sessions. Notice that the curves are very close and that
the deviation from their average is low. An approach for calculating the
reference mouse signature was to use the average from a number of
sessions as a reference. Large deviations between different sessions would
show that the training was not completed properly. This provides an
indication that there is a need for tuning.
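A minimal sketch of the averaging approach is given below, assuming each session signature is a list of curve values sampled at the same points. The function names and the use of the mean absolute deviation as the spread measure are assumptions made for illustration.

```python
def reference_signature(session_curves):
    """Point-wise average of several session curves of equal length."""
    n = len(session_curves)
    return [sum(values) / n for values in zip(*session_curves)]


def mean_abs_deviation(session_curves, reference):
    """Average absolute distance of all session points from the reference.

    A large value suggests that the sessions disagree and that tuning
    (or more training data) may be needed.
    """
    total = sum(abs(v - r)
                for curve in session_curves
                for v, r in zip(curve, reference))
    return total / (len(session_curves) * len(reference))
```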

Determination of the proper detection session period is an important factor to 
consider. The aim is to minimize the detection session without affecting the 
accuracy of the system. 

After the generation of the mouse signature, which represents the user
behavior, an important concern is how to discriminate between users based
on the generated information. The function of the Behavior Comparison Unit
16 is to compare the calculated factors (Mouse Signature) against a
reference signature for the same user.

Figure 9 gives an example of the comparison process. The two curves in 
Figure 9a were for the same user. Notice that the two curves are close to 
each other and that the difference between the curves is low. Figure 9b 
shows two curves for two different users. The difference between the curves 
is high, which indicates a high difference in the behaviors and a high 
possibility that they belong to two different users. 

The comparison technique used for this factor was to calculate the sum of 
the absolute difference between the curves. If the result is higher than a 
threshold, then those curves belong to two different users. The threshold 
can be determined for each user during the enrollment phase, when the 
reference mouse signature is generated. 
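The comparison can be sketched as follows. The sum of absolute differences and the per-user threshold come from the text; the way the threshold is derived here (the largest enrollment distance multiplied by a safety margin) and all function names are assumptions made for illustration.

```python
def curve_distance(candidate, reference):
    """Sum of absolute differences between two curves of equal length."""
    if len(candidate) != len(reference):
        raise ValueError("curves must be sampled at the same points")
    return sum(abs(c - r) for c, r in zip(candidate, reference))


def enrollment_threshold(enrollment_curves, reference, margin=1.5):
    """Hypothetical per-user threshold: worst enrollment distance times a margin."""
    worst = max(curve_distance(c, reference) for c in enrollment_curves)
    return margin * worst


def is_same_user(candidate, reference, threshold):
    """Curves farther apart than the threshold are attributed to different users."""
    return curve_distance(candidate, reference) <= threshold
```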

The Movement Speed compared to Traveled Distance factor (denoted MSD)
had strong discriminating and reproducibility capabilities. Consequently, the
MDS could be based on this factor alone; however, basing the MDS on a
combination of several factors tends to yield better performance.

The analysis of the impact of the direction of movement involved two
kinds of studies. First, studying the relation between the direction of
movement and the movement speed (denoted MDA). Second, studying
the population of actions with respect to the movement direction (denoted
MDH), measured by calculating the percentage of actions in each of the
recognized eight directions of movement compared to the total number of
actions in a session.

Figure 10 shows the distribution of average movement speed against the
direction of movement for two different users. Solid lines represent a
number of sessions for the first user. Dotted lines represent the second
user's sessions. Notice that horizontal movements (directions 2, 3, 6, and 7)
were performed with higher speed than vertical movements (directions 1, 8,
4, and 5).

Figure 11 shows the histograms of the performed actions in each direction.
Notice that some directions gained more actions than others. Furthermore,
there was usually a direction that consumed more actions than all other
directions. The figure shows the distribution for two different users: user 2
performed more actions in the 3rd direction, while user 1's actions dominated
more in the 4th direction. The ratios between curve points were
approximately constant for each user, indicating high reproducibility for this
factor.

MDA and MDH factors were each represented by eight numbers to be added 
to the user's signature. The amplitude of those numbers, and the ratio 
between them produced meaningful information toward behavioral user 
identification. 
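A minimal sketch of how the two eight-number direction factors might be computed from a session is shown below. The input layout (direction, distance in pixels, elapsed time in milliseconds) and the function name are assumptions; MDA is taken as the average speed per direction and MDH as the percentage of actions per direction.

```python
def direction_factors(actions):
    """Return (mda, mdh) for an iterable of (direction 1..8, distance, elapsed_ms).

    mda[k] is the average speed of actions in direction k+1;
    mdh[k] is the percentage of all counted actions in direction k+1.
    """
    counts = [0] * 8
    speed_sums = [0.0] * 8
    for direction, distance, elapsed_ms in actions:
        if elapsed_ms <= 0:
            continue  # skip records without a usable duration
        counts[direction - 1] += 1
        speed_sums[direction - 1] += distance / elapsed_ms
    total = sum(counts)
    mda = [s / c if c else 0.0 for s, c in zip(speed_sums, counts)]
    mdh = [100.0 * c / total if total else 0.0 for c in counts]
    return mda, mdh
```

The same pattern, with three action types in place of eight directions, would apply to the type-of-action factors.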

Type of action analysis is based on the fact that the type of action the user is 
performing affects his behavior. Three types of movements were 
considered: point and click (PC), drag and drop (DD), and regular mouse 
movement (MM). Similar to the direction of movement study, the type of 
action was studied with respect to the movement speed (denoted ATA) and 
the distribution of the performed actions over the three types of actions 
(denoted ATH). Figure 12 shows the relation between the movement speed 
and the type of performed action for the three recognized types of actions. 
Two components were extracted from the curve: the range of each type of 
action, and the ratio between the entries. It is possible to rely on this factor 
for identification if the ratio between the entries is constant. For example,
the speed of movement for user 2 in Figure 12 was at its lowest level for the
point and click type of action compared to other types of actions.

Figure 13 shows the histogram of the types of actions for a number of 
sessions for two different users. Behavior differences were easily detected 
for the two users and values and ratios between entries were easily 
identified. The following facts were extracted from the curves: 

- User 1 performed a very low number of regular mouse movements
and depended mostly on the point and click and drag and drop types.

- User 2 performed a very high number of regular mouse movements, 
and a very low number of point and click actions. 

The reproducibility of this factor was high. Additionally, it was relatively
unique to the user. The information extracted from the analysis was very
helpful for the detection module to differentiate between the behavior of
users.

The histogram of the traveled distance (denoted TDH) illustrates how the 
user performed actions. The number of actions performed with short 
distances was higher than those performed with long distances. 

The distribution of the distances differed from one user to another. Figure 14
shows a comparison between two users: user 2 depended more on short
distances for performing actions. As the probability of occurrence of large
distances is usually low (below 15%), it is possible to depend only on the first
two points of the curve to represent this characteristic. The reproducibility of
this factor was found to be high, while its uniqueness was considered
average.

The elapsed time is the time used to perform an action. It depends on the
type of the performed action. The study of movement elapsed time
histograms (denoted MTH) illustrates how a user's speed varies when he is
performing some actions. Figure 15 shows the time distribution for two
users; the measurement unit used was 0.25 second. The curve shows the
distribution for actions performed in 8 seconds and less, with a 0.5 second
interval between curve points. From this figure we concluded that the
reproducibility of this factor was good. In fact, the first two points of the
curve provided significant behavioral information.

For example: 

- For user 1, the first point in the curve (0 - 0.5 second) represented 
around 34% of the total number of actions. 

- The maximum population for user 1 happened in the first point on the 
curve, while the maximum for the second user happened in the 
second point (0.5 - 1.0 second). 

The results indicated that the first 3 points of the curve could be used to 
represent this factor in the user global signature (e.g. MDS). 
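The elapsed-time histogram can be sketched as follows, using the 0.5 second interval and the 8 second cutoff mentioned above. The function name, the use of seconds as the input unit, and the choice to express each bin as a percentage of the counted actions are assumptions made for illustration.

```python
def elapsed_time_histogram(elapsed_seconds, bin_width=0.5, max_time=8.0):
    """Percentage of actions per time bin for actions shorter than max_time."""
    n_bins = int(round(max_time / bin_width))
    counts = [0] * n_bins
    for t in elapsed_seconds:
        if 0 <= t < max_time:
            counts[int(t // bin_width)] += 1
    total = sum(counts)
    return [100.0 * c / total if total else 0.0 for c in counts]


def mth_inputs(elapsed_seconds):
    """First three histogram points, used to represent the factor."""
    return elapsed_time_histogram(elapsed_seconds)[:3]
```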

By studying the data collected from the experiment and analyzing their 
statistical characteristics, the following observations were made: 

1. The reproducibility of each factor of the mouse signature varied,
depending on the user and the type of factor. Factors with higher
reproducibility gained more weight in the detection process.

2. It was noticed that for some users, some factors had a stronger
discrimination capability than for other users. Factors with higher
uniqueness gained more weight in the detection process.

In order to utilize the observations, the detection technique assigned the
proper level of significance to each factor according to its reproducibility and
its uniqueness. The reproducibility of a factor was detected by analyzing
more sessions for the user, while the uniqueness characteristic was
detected by including a larger number of other users' sessions in the
comparison process. In other words, the detection algorithm was able to
build an identification pattern for each user and utilize all detectable unique
characteristics to discriminate efficiently between different behaviors.

The detection approach adopted in this document consisted of using neural
networks to detect differences between behaviors. Similar neural network
approaches have been used successfully in different recognition
applications, such as face recognition and signature recognition.

The approach consisted of conducting a different neural network training on
a per user profile basis. Figure 16 illustrates how the detection process is
implemented in both the enrollment and detection modes of operation. In
order to enroll a new user, training data was first prepared from previously
recorded sessions stored in the behavior modeling unit database (see Figure
3). Second, a neural network was trained and the status of the trained
network was stored in the signatures database associated with the behavior
detection unit.

In the detection mode, the behavior detection unit loaded the legitimate 
user's stored neural network status. The saved status was then applied to 
the network, and the monitored behavior resulting from session analysis was 
applied to the neural network. The output of the network was the confidence
ratio, a percentage number representing the degree of similarity of the two
behaviors.

The neural network used in the detection process (see Figure 17) was a
feed-forward MLP network consisting of three layers. The input layer
consisted of 39 nodes, which is the total number of inputs representing the
factors involved in the MDS. The hidden and output layers consisted
respectively of 40 and one nodes. The expected output range was from 0 to
100. Table 2 shows the description of the inputs to the network, which
consisted of a set of numbers describing the MDS.



Factor   Description                                         Inputs

MSD      Movement Speed compared to Traveled Distance        12
MDA      Average Movement Speed per Direction of Movement    8
MDH      Direction of Movement Histogram                     8
ATA      Average Movement Speed for Action Types             3
ATH      Type of Action Histogram                            3
TDH      Traveled Distance Histogram                         2
MTH      Movement Elapsed Time Histogram                     3

Table 2. Examples of Factors involved in a Mouse Signature
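As a check on the table, the input counts add up to 12 + 8 + 8 + 3 + 3 + 2 + 3 = 39, the size of the input layer. The sketch below assembles the 39-number input vector from the individual factors; the factor sizes come from Table 2, while the concatenation order, the dictionary layout, and the function name are assumptions made for illustration.

```python
# Number of network inputs contributed by each factor (from Table 2).
FACTOR_SIZES = {"MSD": 12, "MDA": 8, "MDH": 8, "ATA": 3, "ATH": 3, "TDH": 2, "MTH": 3}


def build_input_vector(factors):
    """Concatenate the factor values into one flat list of 39 numbers.

    `factors` maps each factor name to a sequence of the expected length.
    """
    vector = []
    for name, size in FACTOR_SIZES.items():
        values = list(factors[name])
        if len(values) != size:
            raise ValueError(f"{name} needs {size} values, got {len(values)}")
        vector.extend(float(v) for v in values)
    return vector
```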



The transfer function of the neural network was a Log-Sigmoid function. The
output of the network can be defined as follows:

$$ y = \sum_{j=1}^{N} \frac{w_{2j}}{1 + e^{-\left(\sum_{i=1}^{N-1} w_{1ij} x_i - b_{1j}\right)}} - b_{21} $$

Where the x_i represent the inputs to the network, and w_ij, b_ij, and N are as
defined previously, with the hidden-layer weights carrying an additional index
i for the input node. N-1 represents the number of nodes in the input layer.
The back propagation algorithm was used to train the network. The data
prepared for network training was designed as follows:

1. Positive training: data collected from 5 sessions for the user trained 
for an output of 100, meaning 100% confidence in identity. 

2. Negative training: data collected from other users based on 5
sessions per user with an output of 0, meaning 0% confidence in
identity.

Figure 19 shows the training curve for one of the users; the error level is set
to be 0.001. The results indicate that the network was able to detect a
pattern specific to the user, differentiating his behavior from that of others.
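The positive and negative training data can be assembled as in the sketch below. The target values 100 and 0 come from the text; the dictionary layout (user name mapped to a list of per-session MDS vectors) and the function name are assumptions made for illustration.

```python
def build_training_set(user, enrollment_vectors):
    """Return (input_vector, target) pairs for training one user's network.

    Sessions of `user` are positive samples with target 100; sessions of
    every other user are negative samples with target 0.
    """
    samples = []
    for owner, vectors in enrollment_vectors.items():
        target = 100.0 if owner == user else 0.0
        samples.extend((list(v), target) for v in vectors)
    return samples
```

With 5 enrollment sessions per user, a population of k users would yield 5 positive and 5 x (k - 1) negative samples for each trained network.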

Example 1: Experiments involving 22 participants were conducted over 9
weeks. Participants installed the client software and used their machines for
their routine activities. Mouse and keystroke data were collected
transparently and sent to a central server. At the end of the data collection
phase, we used the collected data to conduct an offline evaluation of our
detection system. To do so, we divided the participants into 2 groups: a
group of 10 representing authorized users and a group of 12 representing
unauthorized users. We computed a reference signature for each member
of the first group using some of their own sessions. For each legal user we
used the sessions belonging to the other users (authorized and
unauthorized) to conduct some masquerade attacks on their reference
signature. This resulted in a false acceptance rate of 0.651%.

To evaluate the false positives, for each legal user we compared their own 
remaining sessions (not involved in the computation of the reference 
signature) against their reference signature. This resulted in a false rejection 
rate of 1.312%. 



Figure 18 shows the hardware setup of the experiment. Client software
(responsible for monitoring mouse actions) feeds a detection server
(software) with the monitored data. The client software, which runs as a
background job, starts monitoring user actions when the user login occurs
and stops running when the user logout occurs; the software is totally
transparent and does not affect any other application.

The detection server was installed on a local area network and accepted
connections from local workstations and from outside the network over the
Internet to allow remote users to participate in the experiment. A large
number of participants were connected remotely to the network from their
home computers or from different countries or cities. The server software
stored the collected data in an internal database, along with the session
information containing the user ID and other information.

The hardware configurations of the participating computers varied from P2
266 MHz to P4 1.5 GHz. The server configuration was a P3 450 MHz with
256 MB RAM, running the Windows 2000 operating system. The client
workstations ran different versions of the Microsoft Windows operating
system (Windows 98SE, Windows ME, Windows 2000, and Windows XP).

Data were collected over a total of 998 sessions, an average of about
45 sessions per user. We started the experiment with a maximum detection
period of 20 minutes for the 1st week, followed by 15-minute sessions for
the rest of the experiment duration. The entire experiment lasted 9 weeks.
The number of recorded actions in a session directly affects the training of
the neural network. We set the maximum number of actions in a session to
2000. If the number of actions exceeded this limit, another session was
created and the newly recorded action would be registered in the new
session.

After examining the recorded session data for different users, we noticed
that some of the users produced many more actions in their active sessions
than others. Identifying such users is much easier than identifying those who
generate a lower number of actions.

For the enrollment process, the first five sessions were used to develop the
reference signature. We then found that data collected from five sessions
was enough to develop the reference MDS for most of the users. To do this,
we averaged the resulting signatures for the five sessions to construct the
reference signature, which was then used in the identification/verification
mode.

To simulate real life in our experiment, we randomly divided the participating 
users into two groups: insiders group (10 users/405 sessions) and outsiders 
group (12 users/593 sessions). A reference signature was calculated for 
each user in the first group and stored in the database. Sessions of the 
outsiders' group were used to simulate an attack where the attacker 
signature was not recorded in the database, thereby testing the ability of the 
detection algorithm to target such situations. We conducted the analysis of 
the experiment results in two steps, each addressing one of the two 
hypotheses that have been formulated at the beginning of this section. 

The first part of the analysis was to prove that there was a detectable
difference between a user's signature and all other users' signatures in both
the insiders' and outsiders' groups. We confirmed this by applying the
behavior comparison algorithm to sessions collected from different users
against a reference signature of a given user. FAR was calculated by
conducting this test for all available reference signatures of all the users in
the insiders' group. False acceptance was established if the resulting
confidence ratio was over 50%. Fifty sessions out of the 405 sessions of the
insider group were dedicated to computing reference signatures for the 10
members (5 sessions per user). For each member in the insider group, the
remaining insiders' sessions minus his own sessions were used to conduct
insider attacks against him, which corresponds to a total of 3195
(= 355 x 10 - 355) insider attacks. For each user in the insider group, the
totality of sessions in the outsider group was used to simulate outsider
attacks, which corresponds to a total of 5930 (= 593 x 10) outsider attacks.
Hence, 9125 (= 5930 + 3195) masquerade attacks against the insider group
were simulated. Masqueraders are (malicious) users impersonating different
(legitimate) users [Anderson 1980].

WO 2004/097601
PCT/CA2004/000669

To illustrate the detection process, Table 3 shows sample training data for
five different users. The sample data consists of four factors covering five
sessions per user. The output shown was set to train the network for the
first user. Figure 19 shows the training curve for the first user, indicating its
ability to differentiate between this user and others. To simulate the FAR
calculation process, Table 3 shows the confidence ratio for all the included
sessions after the network has been trained for the first user. Table 4 shows
signatures for one insider (User 5) and two outsiders masquerading as User
1. The insider's signatures shown are different from those used in the
network training; the corresponding confidence ratio is also shown in the
table. After running all the comparisons, we computed the false acceptance
rate as follows: FAR = n_fa / N_fa, where n_fa was the number of false
acceptances and N_fa the total number of tests. At a 50% threshold, we
obtained in our experiment FAR = 0.00651, for N_fa = 9125 attack attempts.
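The attack counts and the rate calculation can be reproduced with the short sketch below. The session counts and the 50% threshold come from the text; the function names are hypothetical, and treating a ratio exactly equal to the threshold as neither a false acceptance nor a false rejection is an assumption.

```python
def false_acceptance_rate(attack_ratios, threshold=50.0):
    """Fraction of attack sessions whose confidence ratio exceeds the threshold."""
    accepted = sum(1 for cr in attack_ratios if cr > threshold)
    return accepted / len(attack_ratios)


def false_rejection_rate(legal_ratios, threshold=50.0):
    """Fraction of legitimate sessions whose confidence ratio falls below it."""
    rejected = sum(1 for cr in legal_ratios if cr < threshold)
    return rejected / len(legal_ratios)


# Counts taken from the experiment description.
insider_sessions, outsider_sessions = 405, 593
insiders, enrollment_per_user = 10, 5

remaining = insider_sessions - insiders * enrollment_per_user  # 355
insider_attacks = remaining * insiders - remaining              # 3195
outsider_attacks = outsider_sessions * insiders                 # 5930
total_attacks = insider_attacks + outsider_attacks              # 9125
```

The same thresholding, applied to each legal user's own remaining sessions, gives the false rejection rate reported in the summary of Example 1.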

An analysis of legal connections was conducted only on the insiders' group,
in which all reference signatures were already calculated for all the group
members. The sessions of each member of the insider group that were
not involved in the calculation of the reference signature were applied to the
detection algorithm. A total of 355 (= 405 - 50) legal connections were
simulated for the whole group. A false rejection was established if the
confidence ratio was below 50%. Table 5 gives an idea of the FRR
calculation process. The table shows a sample signature for 15 sessions
for the same user (user 1), and the confidence ratios computed using his
trained neural network.
ATH    ATA    MSD    NN    CR


821 1624 1123 624 11.742 21.72 1125 1125 
. 8 - 93 1523 11.20 1123 1421 2121 620 1128 

1122 1324 926 429 1429 2043 926 1429 

822 122 927 1078 1120 2122 820 * 1621 
828 1423 828 - 821 1923 2121 7229 1026 


4823 2778 2327 
51.13 34 *3 aa in 
8122 4223 6281 
62.95 3541 1145 
5028 3522 1441 


1712 3326 11124 
8020 2129 5423 
12228 2429 73.11 
10321 2341 7727 
146.19 3828 9346 


16.63 16.16 1521 "12.18 829 
2543 1921 1524 1242 822 
2923 1518 1420 1124 551 
29.19 18 1521 1127 527 
2122 17.77 14.72 1023 523 


100 
100 
100 
100 


99296 
100 
100 
100 - 


7.74 1126 1321 6.10 1221 16.19 1227 1924 
1428 1122.. 628 10.45 12.75 1623 13.01 1428 
16.1 12.74 927 921 12.74 1442 1420 925 
1526 1048 10 823 1341 1429 1528 12.19 
1721 1222 12.11 927 821 1947 . 920 " 973 


3822 023 6079 
3429 122 6423 
3Lo4 6430 
3423 52S 5926 
4524 6.17 47.74 


19822 234 20937 
17028 252 18441 
261.12 244.7 18127 
239.63 8823 20225 
225.78 64.76 23522 


19.75 18.19 1521 920 826 
2044 1922 1521 1124 012 
18.17 1572 1346 1126* 821 
1824 1420 1320 1016 7.13 
1570 1323 1127 928 7.19 


100 
0 
0 
0 
0 


02207 
02208 

0 

0 


.1020 1422 722 920 1321 2121 13.11 1027 
1423 1125 820 1328 1429 1429 11.13 112S 
1124 12.03 823 14.12 1421 1944 879 1028 
10.76 - 923 . 1220 1021 1524 1222 142 14.10 
12.11 9.79 1025 13.14 1424 1520 927 1520 


47.77 1943 3225 
4223 2023 3830 
3925 1825 4228 
4225 1974 3724 
3929 1443 4521 


17721 1342 92.11 
19275 7224 1115 
18024 10272 10027 
20722 83.16 13729 
194.16 8727 11347 


18.04 1741 1524 13.10 1024 
1677 1824 1528 1125 12.15 
2027 1879 1642 1321 1127 
17.72 1728 1525 1228 725 
2020 1825 1521 13.78 822 


0 
0 
0 
0 
0 


0 

0.0207 
02207 
02207 
02208 


1526 520 8.19 22.16 20 1327 325 102 
14.67 1029 923 10.09 2325 1522 726 822 
1542 447 10.69 1621 2328 12.18 5.72 1024 
1729 820 8.19 1321 2127 10.77 722 12.17 
1224 1021 1229 1520 2148 1229 828 6213 


2429 1327 6128 
2620 1022 6224 
2228 1228 64.17 
2124 1428 6323 
29.63 1120 8821 


176.04 572 8256 
16024 13821 8025 
1527 10322 6022 
130.78 6429 .101.54 
178.7 3521 90.12 


1825 1724 1728 1628 1625 
2225 1921 18.11 1520 14.18 
1925 192S 1925 14.15 1220 
2017 20.14 20.1 .1748 12.16 
2576 1846 1623 1526 1025 


0 
0 
0 
0 
0 


0 
0 
0 

O0210 
0 


1124 7.63 1022 1124 1324 15 . 1127 1726 
. 1501 6.77 944 11.13 1824 12.10 11.13 1549 
1021 .1024 1140 1027 1525 20.15 1028*' 1021 
1120 1224 8.19 821 1878 1729 1012 1229 
11.16 825 873 12.13 2228 1629 722 1228 


4447 1825 3921 
3341 1325 5278 
38.72 17.77 4323 
38.62 132S 4928 
28.1S 10.19 6140 


24573 14848 18722 

253.65 158.71 13225 
26727 1482 16524 
15629 1612 8524 

229.66 192.76 1342 


1721 1678 1342 1270 071 
1725 1520 13.17 1123 1023 
1072 15.73 1327 1152 622 
1826 1826 1541 12.16 .821 
1825 1671 14.14 12.74 828 


0 
0 
0 
0 

. 0 
0 


0 
0 

0.0193 
0 
0 
0 



Table 3. Training data for five different users 





ATH 


ATA 


USD 


CR 


12.17 829 920 1523 1525 1320 11,69 13.60 
.12.02 10.48 071 1048 13.81 2425 722 10.74 
1324 9.74 668* 9.74 20 1427 1021 1425 
1062 829 017 1028 2425. 1228 1008 1525 
7.12 928 1028 1025 2027 1342 1123 1043 


3027 13.12 5626 
3621 1423 4829 
3220 1421 8222 
4741 1007 3324 
4129. 19.17 3945 


22226 169.6 117.15 
235.18 17723 119.68 
2372 15627 10721 
21622 11527/13849 
22527 6611 15472 


1014 1599 14.16 12.71 1027 
1018 1517 14.06 13.78 10 
1846 17.77 1449 1325 1020 
1770 1424 1452 1053 845 
1843 1372 1622 1240 925 


572&07 
624&09 
6135-10 
126&C8 


1345 724 1773. 1345 1029 1621 10.70 9.78 
922 . 102 1776 928 . 074 1475 1628 1224 
1223 611 7.12 2128 1327 1025 .1225 1327 
13,05 . 525 1222 15 1325 1328 . 1321 132 
929 OIB 1020 1626 1121 1424 1515 1323 


723 244 8920 
1326 325 8221' 
16.51 Oil 7228 
1321 828 7944 
22.72 • 1121 65.15 


27223 127 11022 
15022 69246 10725 
2032 68.125 8322 
20229 10523 82229 
17451 87282 10528 


1O10 . 1825 1721 1426 1150 

2228 2035 1728 1549 1223 
2150 2026 1824 1521 10.72 
2074 1723 1829 1501 1052 

2229 22.69 1Q7S 18.75 12.5 


4.11E-07 
143&OS 
129&0S 
143&05 
143&0S 


1571 13.66 5.69 . 728 1526 1528 10251 1062 
2028 13.42 720 017 1625 921 723 1428 
14.70 1425 723 921 .. 1274 . 1724 823 14.70 
1822. 1420 825 . 721 1022 1524 1X66 t023 

1572 14.78 015 11.73 526 1727 1126 1325 


4326 1227 43.73 
39.15 9.172 5145 
3224 823 6857 
4123 1420 4373 


20846 123.09 12027 
20024 77.17 101.11 
20624 6828 10629 
22378 109.17 12728 


1927 1821 1525 1225 1022 
24.10 1611 1577 1022 7218 
2322 1690 1427 1322 0.14 
1724 1828 15 12.7 923 


129&05 
3.77E-05 
017948 
S.03&07 
1.13E46 






Table 4. Simulated Attack: One Insider and Two Outsiders Masquerading as User 1



MDH



ATH 



ATA 



USD 



1140 
1127 


928 
1140 


1127 
13.75 


1020 
1127 


1543 
1221 


2020 
1721 


1023 
OT2 


1040 
12.24 


10.99 


1729 


013 


679 


12.18 


2125 


727 


12.18 


1226 


1440 


1122 


7.71. 


.1359 


2128 


720 


11.14 


1122 


1079 


1228 


728 


1421 


20.70 


1027 


11.15 


1222 


616 


1721 


1028 


13.77 


1515 


928 


1024 


727 


10.78 


629' 


616 


1224 


2627 


1078 


1621 


13.68 


1028 


627 


617 


1222 


2971 


520 


1425 


1278 


1522 


628 


822 


1623 


17.64 


828 


1144 


12 


9233 


112 


723 


112 


2226 


82 


18 


1270 


1026 


826 


521 


1622 


2026 


826 


1722 


648 


1221 


1048 


628 


821 


2921 


429 


17.13 


1041 


1579 


.825 


825 


1328 


21.18 


620 


1222 


12 


1033 


1023 


'956 • 


1326 


2123 


623 


1528 


102 


1.24 


11.11 


947 


1429 




674 


1420 



5020 

50 
4073 
602, 
4821 
5123 
4827 
4928 
6016 
49.16 
5023 
4841 



4520 422 



3623 
3979 
3924 
4774 
4377 
4823 
4749 
'4928 
4027 
4626 



7.10 
1023 
1456 
620 
1151 
321 
529 
123 
120 
123 
615 
223 
22.04 



10524 3222 
10424 3778 
7729 2527 
10224 2429 
12221 32.13 
8279 2728 
8022 1923 
5728 2425 
113.83 2822* 
7325 26.58 
7665 3123 
4820 2523 
9725 32.16 
7128 2725 
12128 3547 



7223 
54.12 
7142 
8120 
8328 
3744 
6025 
3028 
492 
6016 
725 
618 
6723 
23.76 
93.55 



2446 20.10 

2321 1610 

2720 24.05 

32.15 1925 

28.14 2027 

1729' 1725 

27.12 2221 

2501 1321 

2427 24.04 

1625 1655 

1228 12.19 

1827 1827* 

1427 1529 

3328 2176 

1922 1923 



1524 928 

1428 1O02 

1928 12.66 

13.07 1021 

1529 1022 

1624 1228 

1727 1228. 

1222 10.73 

1126 1120 

028 • 846 
1128 % 8.92 

1727 1024 

1424 1176 

1124 922 

14.18 1223 



520 
'626 
720 
668 
625 
5.44 
525 
6.11 
525 
425 
623 
621 
561 



m 

100 
07.19 . 

100 

103 
97.19 
97.19 
07.19 

100 
07.19 
97.19 
97.19 

100 
97.19 

100 



Table 5. FRR Calculation for User 1
MDH

11.40 



ATH 



ATA 



USD 



CR 



1157 11.40 

10.00 1759 

1256 14.40 

1152 10.70 

12.02 ai6 

7.87 10.76 

1363 10,68 

12.70 1552 

12 0533 

12.70 1056 

8.48 1251 

10.41 15.70 

12 1053 

102 1154 



TTsT 
ia75 

0.13 

1152 

1258 

1751 

859 

6.67 

858 

115 

656 

10.48 

855 

1053 

11.11 



1&50- 
1157 
8.70 
7.71 
756 
1058 
6.18 
6.17 
8.02 
753 
551 



1£01 

12.18 

13.60 

1451 

13.77 

1254 

1252 

1653 

115 

1652 

051 

13.68 

13.60 

1458 



1751 8.72 

21.65 757 
2156 750 

20.70 10.07 
16.15 058 
2657 10.78 

20.71 550 
1754 858 

22.66 85 
20.06 856 
20.81 4.09 
21.18 850 
2153 653 
10.85 8.74 



10.40 
1254 
12.18 
11.14 
11.15 
1054 
16.61 
1455 
11.44 
16 

1752 
17.13 
1252 



48.73 

50.0 

9851 

51.53 

48.07 



50.16 
40.16 
J 0.33 
48.41 
50. GO 
15.68 3053 
14.20 47.17 



43.90 
3850 
3653 
30.70 
39.04 
47.74 
43.77 
48. S3 
47.40 
4058 
40.07 



558 

7.10 

1053 

1458 

850 

1151 

3.01 

5.89 

1.83 

150 

153 

a is 



10554 3252 
104.04 37.78 
77.60 2557 
102.64 2459 
122.61 32.13 
32.79 27.08 
9052 10.03 
5758 2455 
113.03 28.02 
73.85 2658 
70.65 3153 
I 25.83 

97.65 32.16 

71.66 27.05 
12156 35.47 



7253 
54.12 
71.42 
01.80 
8358 
37.44 
60.05 
3058 

4ae 

50.18 
755 

ais 

6753 
23.76 



24.46 
2351 
2750 
32.15 
28.14 
1750 
27.12 
25.01 
2457 
1655 
1258 
1857 
14.07 
J3.03 



20.10 
16.18 
24.05 
1055 
20.07 
17.05 
2251 
13.61 



12.10 
1857 
1559 
21.76 
1053 



15.64 

1458 

1058 

13.07 

15.69 

1654 

17.07 

12.02 

1156 

058 

1158 

1757 

1454 

1154 

14.18 



056 651 

10.02 5.44 

1256 550 

1051 0.03 

1052 750 
1258 6.68 
1256 655 
10.73 5.44 
1150 5.65 
0.46 5.11 
0.02 555 
1054 455 
11.76 653 
052 551 
1253 5.61 



100 

too 

97.10 

100 

100 

97.10 

97.10 

97.10 

100 

97.10 

97.19 

97.10 

100 

37.10 



Table 5. FRR Calculation for User 1 



In the experiment described above, we gave total freedom to the participants 
about which operating environments to use. As a consequence, data were 
collected using a variety of hardware and software systems. Questions 
remained about the impact of these variables on the results obtained. For 
example, what if the perceived difference between the MDS of two different 
users was simply due to the fact they were using different software 
applications? 

In order to answer these questions, we conducted a small experiment where 
seven different users were asked to perform the same set of actions using 
the same machine. More specifically, we developed a fixed user interface 
for the experiment where each user is asked to perform a specific action 
between two rectangles. The process was repeated 100 times per user 
session. In each round the program forces the user to perform the action in 
a specific direction by changing the position of both rectangles; the distances 
between the boxes are equal. The software records the time the user
takes to perform the action. All environment variables were fixed in this
experiment.

The first null hypothesis we wanted to prove is that, for a given mouse
signature factor, similar user behavior is observed when all other
environment variables are fixed. Table 6 shows seven different sessions for
the same user performing drag and drop in the eight recognized directions.
The time shown is the average time required to perform the action in
milliseconds. In order to emphasize the similarity of the readings, we
calculate chi-square for the recorded sessions. We use the 1st session as
the expected frequency in the chi-square test. Since we were comparing 8
proportions, the number of degrees of freedom is 7; for this number the
critical value at the 0.01 significance level is χ² = 18.475. From Table 6
we noticed that most of the calculated values are lower than this value
(only one result is slightly above the limit), which means that the first
null hypothesis is not rejected.



B 6 7 8 *^g. p — 

" S " M - 33 7SM *"•« ^ i03.se 86 . 62 112 . <7 „ 

X0S.3* M . 71 „. M l01 ., m . 63 ?412 

10..M 88 . 32 „.,. m . s 101 83 . 92 M2 »' - 6 

«... 7 , 68 l23 . u 113 . 35 1196< mM 92 4i - 

9»« 7 2 . 97 123 . 33 10< . 38 9J . eo 9870 9sb9 
107.B, S4.0! „.„ al6lS2 104 e0 . 89 mo 

m.. »a. 96 ...» 121 .33 108 . 52 88 . 01 128 . <7 83- „ 3 izo33 
Table 6. Comparing drag-drop sessions for the same user 

The second null hypothesis we wanted to prove is that there is a detectable
difference between different users, which does not depend on other
environment variables like hardware and software configuration. Table 7
shows seven sessions for seven different users; we use the 1st user's
session as the expected frequency. Chi-square is calculated for the other
six users. The results shown indicate significant differences in the
compared frequencies, proving the second null hypothesis.
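The chi-square comparison described above can be sketched as follows. This is a minimal illustration, not the applicants' implementation: the function and constant names are assumptions, and the two rows are the eight per-direction averages for User 1 and User 2 as read from Table 7. With User 1 as the expected frequency, the statistic for User 2 comes out at about 58.29, in line with the value listed in the table and well above the 18.475 limit.

```python
# Limit quoted in the text for 7 degrees of freedom.
CRITICAL_7_DF = 18.475


def chi_square(observed, expected):
    """Pearson chi-square statistic: sum of (O - E)^2 / E over all directions."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Average drag-drop time (ms) per direction, as read from Table 7.
USER_1 = [106.81, 137.58, 77.09, 128.62, 110.87, 121.69, 146.6, 74.48]
USER_2 = [105.35, 95.71, 65.92, 101.8, 101.63, 74.12, 94.66, 80.87]

stat = chi_square(USER_2, USER_1)   # User 1 is the expected frequency
differs = stat > CRITICAL_7_DF      # True means a significant difference
```

The same function applies to the same-user comparison of Table 6, where values below the limit indicate similar behavior across sessions.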






WO 2004/097601 PCT/CA2004/000669 



User      1        2        3        4        5        6        7        8        Avg.     χ²
User 1    106.81   137.58   77.09    128.62   110.87   121.69   146.6    74.48    127.13   0
User 2    105.35   95.71    65.92    101.8    101.63   74.12    94.66    80.87    103.59   58.28
User 3    95.76    89.28    65.15    103      97.23    82.14    122.52   73.74    104.23   43.54
User 4    187.7    142.32   137.76   212.5    196.87   148.92   208.87   153.75   200.16   347.49
User 5    91.31    138.87   90.71    135      81.28    85.61    84.46    67.14    108.54   60.64
User 6    122      95.44    83.66    117.62   120.06   88.74    145.06   115.40   127.9    48.74
User 7    100.73   84.76    63.84    107.44   112.83   88.17    108.88   73.80    105.99   45.36

Table 7. Drag-drop sessions for seven different users 

Keystroke dynamics 

Table 8 shows a combination of tri-graphs generated from three sessions for
two different users, and the corresponding time used to perform the
tri-graphs in milliseconds. The tri-graphs shown are centered on the
character 'A' (ASCII code 65). From the table we can notice the similarity
between the response times for the first user's sessions. We can also
notice an obvious difference in behavior between the two users, which can
easily be detected for some of the tri-graphs (marked in bold).











Tri-graph    Time (ms)
87-65-68     86     85     73
83-65-89     83     82     69
77-65-78     76     70     60
70-65-69     134    112    62
82-65-72     122    92     80
77-65-78     74     76     68
87-65-68     80     81     71
83-65-89     71     75     111
83-65-76     62     62     59
83-65-76     67     64     63
76-65-77     143    205    56



Table 8. Time used to perform different tri-graphs for two different users 

In access control applications the extracted group of digraphs and
tri-graphs is pre-defined, since the user is asked to enter a paragraph
containing them. In intrusion detection applications, however, this
scenario is not applicable.

Detecting the behavior from an unexpected set of digraphs requires a large
amount of data to be collected in the enrollment mode so as to cover a
higher percentage of the captured data in the verification mode.

Our goal was to design a detection algorithm that generates a Keystroke 
Dynamics Signature or KDS, which could be used as a reference user profile 
and matched against active user profiles to dynamically detect 
masqueraders. 

We propose two different approaches to construct the KDS: a digraph-based
approach, which utilizes a single neural network per user, and a
key-oriented neural-network-based approach, where a neural network is
trained for each keyboard key to best simulate its usage dynamics with
reference to other keys. We also propose a technique which can be used to
approximate a tri-graph value based on other detected tri-graphs and the
locations of the keys with reference to each other, aiming to minimize the
failure to compare ratio (FTC) and to speed up the user enrollment process.

The first approach we propose is a digraph-based analysis approach. The
approach utilizes a neural network to simulate the user behavior based on
the detected digraphs. The neural network (Figure 20) used for this
approach is a feed-forward multilayer perceptron network. The training
algorithm is back propagation. The network consists of four layers: an
input layer, two hidden layers, and a single-node output layer.

The input layer consists of N nodes, where N = 2 x Number of Monitored
Keyboard keys. Input to the nodes is binary, 0 or 1, as each node in the
input layer represents a key. The first N/2 nodes represent the key where
the action starts, and the second N/2 nodes represent the key where the
action ends. Each batch of nodes should have only one input set to 1 while
the other inputs are set to 0; the node set to 1 represents the selected
key.
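The input encoding described above can be sketched as a pair of one-hot batches. This is only an illustration under stated assumptions: the function name `encode_digraph` and the representation of the monitored keys as a Python list are hypothetical, and the resulting vector has 2 x (number of monitored keys) entries as stated in the text.

```python
def encode_digraph(start_key, end_key, monitored_keys):
    """Build the binary input vector for one digraph.

    The first batch of nodes marks the key where the action starts and the
    second batch marks the key where it ends; exactly one node per batch is 1.
    """
    k = len(monitored_keys)
    vector = [0] * (2 * k)
    vector[monitored_keys.index(start_key)] = 1      # start-key batch
    vector[k + monitored_keys.index(end_key)] = 1    # end-key batch
    return vector


# Example with three monitored keys (a hypothetical, very small key set).
example = encode_digraph("B", "A", ["A", "B", "C"])
```

A key that is not monitored raises `ValueError` from `list.index`, which is one simple way to skip digraphs that fall outside the monitored set.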

During the enrollment mode, a batch of M actions will be collected and fed
to the behavior modeling neural network as training data. The factor M,
representing the number of actions used for enrollment, will be determined
based on another factor D, which represents the percentage coverage of the
collected digraph combinations during the data collection process. When
this percentage reaches a specific pre-defined limit, the collected data
can be used for the enrollment process.
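One way to read the coverage factor D is as the share of possible start/end key pairs that have been observed at least once. The text does not define D precisely, so the sketch below is an assumption: the function names, the use of all ordered key pairs as the denominator, and the example limit are hypothetical.

```python
def digraph_coverage(actions, monitored_keys):
    """Percentage of possible (start, end) key pairs seen at least once."""
    keys = set(monitored_keys)
    seen = {(s, e) for s, e in actions if s in keys and e in keys}
    return 100.0 * len(seen) / (len(keys) ** 2)


def ready_for_enrollment(actions, monitored_keys, limit):
    """True once the coverage D reaches the pre-defined limit (in percent)."""
    return digraph_coverage(actions, monitored_keys) >= limit


# Three collected actions over two monitored keys; the repeated digraph
# counts once, so 2 of the 4 possible pairs are covered.
collected = [("A", "B"), ("A", "B"), ("B", "A")]
coverage = digraph_coverage(collected, ["A", "B"])
```

Under this reading, M is simply the number of actions collected by the time `ready_for_enrollment` first returns true.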

A simulation will run after the neural network has been trained with this
batch. This simulation will consist of a number of non-redundant actions
picked from the enrollment data. The result of this simulation will be
stored for each user together with the training data, which will also be
used in the verification stage.

A small batch of actions will be used in this stage to verify the user
identity; this batch will be added to the training batch of the user's
neural network, resulting in a network with different weights. The effect
of the small batch on the network weights represents a deviation from the
enrollment network. In order to measure this deviation, another simulation
will run on this network with the same batch prepared for the enrollment
process for the specific user. By comparing the result of this simulation
to the enrollment stage result, the deviation can be specified. One
approach that can be used here is to calculate the sum of the absolute
differences of the two results: if this deviation is low (within a
specific limit), the collected sample belongs to the same user; if not,
the sample belongs to another user.
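The deviation measure suggested above, the sum of absolute differences between the two simulation results, can be sketched as follows. The function names and the example numbers are hypothetical; the two lists stand for the network outputs on the same simulation batch before and after the verification batch was added to the training data.

```python
def deviation(enrollment_output, verification_output):
    """Sum of absolute differences between two simulation results."""
    if len(enrollment_output) != len(verification_output):
        raise ValueError("both simulations must use the same batch")
    return sum(abs(a - b) for a, b in zip(enrollment_output, verification_output))


def same_user(enrollment_output, verification_output, limit):
    """Accept the sample when the deviation stays within the limit."""
    return deviation(enrollment_output, verification_output) <= limit


# Made-up outputs for a two-action simulation batch.
d = deviation([0.2, 0.5], [0.25, 0.4])
```

The limit itself is not given in the text and would have to be chosen per deployment.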
Our second proposed approach is based on tri-graph analysis; we name this
approach the "Key Oriented" approach because it is based on assigning a
neural network for each monitored key on the keyboard. The neural network
used in this approach is similar to the one described in the previous
section. The training procedure requires passing the tri-graph start key,
end key, and the elapsed time [...]. Figure [...] shows how the network is
utilized in the enrollment or detection phases.

The coverage matrix is a three-dimensional matrix which is used to store the
[...]. Keeping track of such information helps in different areas, such as
in evaluating the overall coverage of the enrollment process and the
development of a customized enrollment scenario which can be used in case
[...]. It is also [...] technique, which is explained in the next section.

In order to develop a technique to help minimize the amount of data [...]
enrollment process, needed information from the information detected so
far should be extracted.

The approximation matrix, which is a two-dimensional matrix, represents the
relations between the keys and how close or far they are from each other.
[...] initialized [...] numbers [...] distances between the keys on the
keyboard.

Figure 22 illustrates how the approximation process is performed. Let us
assume that an approximation for the EB digraph is needed; we can detect
[...] corresponding value in the coverage matrix (Figure 22b). The
approximation matrix will be used to locate alternative entries (for [...])
[...] lowest [...] in the matrix; in this case they will be (D,F) and (G,H)
respectively.
From this step we can enumerate the tentative approximations; in this case
they are DG, DH, FG, and FH. In the next step the distance of each
combination will be calculated from the approximation matrix (underlined
numbers in Figure 22a), and the combinations will be sorted according to
their closeness to the original distance of the approximated digraph
(AppMatrix(EB) = 3). The sorted result is (FH, DG, DH, FG).

The coverage matrix may be used to make the final decision out of the
sorted result. The matrix in Figure 22b shows only the weights of the
tentative combinations. Notice that digraph FH has a coverage of 30, which
means that it is a good candidate (the best fit in this case). The second
alternative, DG, also has good coverage, while DH has a relatively low
coverage.
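The enumeration, sorting, and coverage check described for the EB example can be sketched as below. Several things here are assumptions rather than the disclosed method: the matrices are stored as plain dictionaries keyed by digraph, the alternative keys (D, F) and (G, H) are passed in as already located, the final decision uses a simple minimum-coverage rule, and every number except AppMatrix(EB) = 3 and the coverage of 30 for FH is made up to reproduce the ordering given in the text.

```python
from itertools import product


def rank_approximations(target, start_alts, end_alts, app_matrix):
    """Enumerate candidate digraphs and sort them by how close their
    approximation-matrix distance is to that of the target digraph."""
    target_distance = app_matrix[target]
    candidates = [a + b for a, b in product(start_alts, end_alts)]
    return sorted(candidates, key=lambda dg: abs(app_matrix[dg] - target_distance))


def choose_approximation(ranked, coverage, min_coverage):
    """Return the first ranked candidate with enough coverage, else None."""
    for digraph in ranked:
        if coverage.get(digraph, 0) >= min_coverage:
            return digraph
    return None


# Hypothetical values, except AppMatrix(EB) = 3 and coverage(FH) = 30.
APP_MATRIX = {"EB": 3, "FH": 3, "DG": 4, "DH": 5, "FG": 6}
COVERAGE = {"FH": 30, "DG": 25, "DH": 4, "FG": 12}

ranked = rank_approximations("EB", ["D", "F"], ["G", "H"], APP_MATRIX)
best = choose_approximation(ranked, COVERAGE, min_coverage=20)
```

With these values the ranking is FH, DG, DH, FG and FH is selected, matching the outcome described above; a different decision rule, such as weighting distance against coverage, would fit the text equally well.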